Outlier Document Filtering Applied to the Extractive Summarization

نویسنده

  • Metin Turan
چکیده

Summarization requires selection of the more informative sentences within a set of documents. Generally, process assumes the document set includes related topics to a subject. However, some of the documents may be outlier and the effect of an outlier document might affect the success of extractive summary. Research is focused on filtering documents at the extraction stage these are outlier. Extraction finds the outlier documents far distance from representative document set word vector (DSWV). DUC 2006 data set is used for tests. System summaries are compared with the systems generated by DUC participants. Results points out that filtering outlier documents overwhelm all the systems fairly. Keywords— Document Processing, Extractive Summarization, Outlier Detection, Similarity Measure

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with rapid growth of the World Wide Web and creation of Internet sites and online text resources, text summarization issue is highly attended by various researchers. Extractive-based text summarization is an important summarization method which is included of selecting the top representative sentences from the input document. When, we are facing into large data volume documents, the extr...

متن کامل

ISSN 1392–124X (print), ISSN 2335–884X (online) INFORMATION TECHNOLOGY AND CONTROL

We studied outlier document filtering (ODF) for extractive sentence summarization. Our results are superior compared to the average of the participant systems’ using DUC 2006. Furthermore, we add extractive paragraph summarization to the same system. It is surprising that the results are nearly the same for ROUGE metrics. Although extractive paragraph summarization has a better performance for ...

متن کامل

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

متن کامل

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

    Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

متن کامل

Empirical analysis of exploiting review helpfulness for extractive summarization of online reviews

We propose a novel unsupervised extractive approach for summarizing online reviews by exploiting review helpfulness ratings. In addition to using the helpfulness ratings for review-level filtering, we suggest using them as the supervision of a topic model for sentence-level content scoring. The proposed method is metadata-driven, requiring no human annotation, and generalizable to different kin...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014